Statistical Section Segmentation in Free-Text Clinical Records
Abstract
Automatically segmenting and classifying clinical free text into sections is an important first step toward automatic information retrieval, information extraction and data mining tasks, as it helps to ground the significance of the text within. In this work we describe our approach to automatic section segmentation of clinical records such as hospital discharge summaries and radiology reports, along with section classification into pre-defined section categories. We apply machine learning to the problems of section segmentation and section classification, comparing a joint (one-step) and a pipeline (two-step) approach. We demonstrate that our systems perform well when tested on three data sets, two of hospital discharge summaries and one of radiology reports. We then show the usefulness of section information by incorporating it in the task of extracting comorbidities from discharge summaries.
Similar resources
Automatic Segmentation of Clinical Texts - Preliminary Results
Clinical narratives, such as radiology and pathology reports, are commonly available in electronic form. However, they are also commonly entered and stored as free text, and knowledge of their structure is necessary for enhancing the productivity of the healthcare departments and facilitating research. This paper presents a preliminary study attempting to automatically segment medical reports i...
Maintenance of a Computerized Medical Record Form
Structured entry forms for clinical records should be updated to take into account the physicians' needs during consultation and advances in medical knowledge and practice. We updated the computerized medical record form of a hypertension clinic, based on its previous use and clinical guidelines. A statistical analysis of previously completed forms identified several unnecessary items rarely us...
A Framework for Clustering Massive Text and Categorical Data Streams
Many applications such as news group filtering, text crawling, and document organization require real time clustering and segmentation of text data records. The categorical data stream clustering problem also has a number of applications to the problems of customer segmentation and real time trend analysis. We will present an online approach for clustering massive text and categorical data stre...
A Pragmatic Approach to Summary Extraction in Clinical Trials
ClinicalTrials.gov, the National Library of Medicine clinical trials registry, is a monolingual clinical research website with over 29,000 records at present. The information is presented in static and free-text fields. Static fields contain high-level informational text, descriptors, and controlled vocabularies that remain constant across all clinical studies (headings, general information). F...
A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling
In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...
Publication date: 2012